Polynomial-Delay and Polynomial-Space Algorithms for Mining Closed Sequences, Graphs, and Pictures in Accessible Set Systems

Authors

  • Hiroki Arimura
  • Takeaki Uno
Abstract

In this paper, we study efficient closed pattern mining in a general framework of set systems, which are families of subsets ordered by set-inclusion with a certain structure, proposed by Boley, Horváth, Poigné, and Wrobel (PKDD’07 and MLG’07). By modeling semi-structured data such as sequences, graphs, and pictures in a set system, we systematically study efficient mining of closed patterns. For a class of accessible set systems with a tree-like structure, we present an efficient depth-first search algorithm that finds all closed sets without duplicates with polynomial delay and in polynomial space w.r.t. the total input size, using efficient oracles for the membership test and the closure computation for the pattern class. From these results, we show that the closed pattern mining problems are efficiently solvable in both time and space for the following classes: convex hulls, picture patterns in 2-D planes, maximal bi-cliques, closed relational graphs, and closed patterns for rigid motifs with wildcards.
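
The abstract describes a depth-first search that lists every closed set exactly once, without storing the sets already found. The following is a minimal sketch of that idea in the simplest setting only: plain itemsets, where every subset is a member of the set system. It is not the paper's algorithm for accessible set systems. It uses a prefix-preserving closure extension to avoid duplicates, and the names `closure`, `support`, and `closed_sets` are hypothetical, chosen for illustration.

```python
from typing import FrozenSet, Iterable, Iterator, List


def closure(itemset: FrozenSet[int], transactions: List[FrozenSet[int]],
            universe: List[int]) -> FrozenSet[int]:
    """Closure oracle (assumed form): intersect all transactions containing itemset."""
    covering = [t for t in transactions if itemset <= t]
    if not covering:
        # No transaction contains the set; by convention return every item.
        return frozenset(universe)
    result = set(covering[0])
    for t in covering[1:]:
        result &= t
    return frozenset(result)


def support(itemset: FrozenSet[int], transactions: List[FrozenSet[int]]) -> int:
    """Number of transactions that contain itemset."""
    return sum(1 for t in transactions if itemset <= t)


def closed_sets(transactions: Iterable[Iterable[int]],
                min_support: int = 1) -> Iterator[FrozenSet[int]]:
    """Yield each closed itemset with support >= min_support exactly once."""
    db = [frozenset(t) for t in transactions]
    universe = sorted(set().union(*db)) if db else []

    def expand(closed: FrozenSet[int], core: int) -> Iterator[FrozenSet[int]]:
        yield closed
        # Only try items whose position is after the one that produced `closed`.
        for i in range(core + 1, len(universe)):
            item = universe[i]
            if item in closed:
                continue
            candidate = closed | {item}
            if support(candidate, db) < min_support:
                continue
            new_closed = closure(candidate, db, universe)
            # Prefix-preserving test: the closure must not add any item smaller
            # than `item`; otherwise this set is reached from another parent.
            if all(x in closed for x in new_closed if x < item):
                yield from expand(new_closed, i)

    root = closure(frozenset(), db, universe)
    if support(root, db) >= min_support:
        yield from expand(root, -1)
```

Calling `list(closed_sets([{1, 2, 3}, {1, 2}, {2, 3}]))` enumerates the closed itemsets of that small database. Memory use beyond the input is bounded by the recursion depth, which is at most the number of items, because no table of visited sets is kept.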

Similar resources

Efficient Closed Pattern Mining in Strongly Accessible Set Systems

Many problems in data mining can be viewed as a special case of the problem of enumerating the closed elements of an independence system with respect to some specific closure operator. Motivated by real-world applications, e.g., in track mining, we consider a generalization of this problem to strongly accessible set systems and arbitrary closure operators. For this more general problem setting,...

A Closed Frequent Subgraph Mining Algorithm in Unique Edge Label Graphs

Problems such as closed frequent subset mining, itemset mining, and connected tree mining can be solved with polynomial delay. However, mining closed frequent connected subgraphs requires exponential time. In this paper, we present ECE-CloseSG, an algorithm for finding closed frequent unique edge label subgraphs. ECE-CloseSG uses a search space pruning and ap...

Tenacity and some other Parameters of Interval Graphs can be computed in polynomial time

In general, computation of graph vulnerability parameters is NP-complete. In the past, some algorithms were introduced to prove that the toughness, scattering number, integrity, and weighted integrity of interval graphs can be computed in polynomial time. In this paper, two different vulnerability parameters of graphs, tenacity and rupture degree, are defined. In general, computing the tenacity o...

ON THE EDGE COVER POLYNOMIAL OF CERTAIN GRAPHS

Let $G$ be a simple graph of order $n$ and size $m$. An edge covering of $G$ is a set of edges such that every vertex of $G$ is incident to at least one edge of the set. The edge cover polynomial of $G$ is the polynomial $E(G,x)=\sum_{i=\rho(G)}^{m} e(G,i) x^{i}$, where $e(G,i)$ is the number of edge coverings of $G$ of size $i$, and $\rho(G)$ is the edge covering number of $G$. In this paper we stud...

Time and Space Efficient Discovery of Maximal Geometric Graphs

A geometric graph is a labeled graph whose vertices are points in the 2D plane with an isomorphism invariant under geometric transformations such as translation, rotation, and scaling. While Kuramochi and Karypis (ICDM2002) extensively studied the frequent pattern mining problem for geometric subgraphs, maximal graph mining has not been considered so far. In this paper, we study the maximal...

Publication date: 2009